
AD-A129 109
ON THE USE OF STAGEWISE REGRESSION IN RANDOM BALANCE SCREENING EXPERIMENTS (U)
DESMATICS INC, STATE COLLEGE, PA. C. A. MAURO. MAY 83. TR-113-8. N00014-79-C-0650
UNCLASSIFIED    F/G 12/1    NL


ON THE USE OF STAGEWISE REGRESSION
IN RANDOM BALANCE SCREENING EXPERIMENTS

by

Carl A. Mauro



DESMATICS, INC.

P. O. Box 618
State College, PA. 16801
Phone: (814) 238-9621

Applied Research in Statistics - Mathematics - Operations Research




TECHNICAL  REPORT  NO.  113-8 


May  1983 


This study was supported by the Office of Naval Research
under Contract No. N00014-79-C-0650, Task No. NR 042-467

Reproduction in whole or in part is permitted
for any purpose of the United States Government

Approved for public release; distribution unlimited




I.  INTRODUCTION  AND  BACKGROUND 


Random Balance (RB) is a design technique that may have much to
offer the researcher planning a factor screening experiment. The RB
concept is most useful, however, in the design of supersaturated screening
experiments. An experiment is supersaturated when the number of factors
(i.e., design variables) under investigation exceeds the number of runs
available. As it is, screening experiments are often handicapped by the
scarcity of experimental runs because of time, budget, and/or resource
limitations. We are concerned in this paper with the supersaturated
situation.

In RB designs, unlike more conventional designs, no mathematical
relation or restriction need exist (except that an even number of runs be
used) between the sample size N and the number of factors K under
consideration. Because of this flexibility, the RB technique permits the
researcher to screen a large (or small) number of possible contributing
factors in an experiment involving a limited (N < K) number of test runs.
Another advantage is that RB designs are easy to prepare for any
combination of N and K.

A major concern with RB experimental design is that there are no
specific or unique statistical techniques for analyzing RB designs. (See [5]
and [6] for a more complete discussion.) There is no one particular method,
therefore, that ought to be used to analyze RB screening experiments.
Satterthwaite [5] has remarked that practically any technique used to analyze
data without RB properties can be applied to any (suitably small) subset
of factors in an RB design. The simplest approach, then, would be to
consider each factor separately and apply some standard test of significance.
Accordingly, Mauro and Smith [4] have considered the use of a standard
F-test applied separately to each factor as the method of analysis for RB
designs.

A more sophisticated means of analysis, which is considered by
Anscombe [1] and Budne [2], is as follows. We first determine the factor,
say $x_i$, most highly correlated with the response variable Y. After a
simple regression equation in $x_i$ has been fit, the residuals
$Y - \hat{Y}(x_i)$ are found. These residuals are now considered as response
values and the process is repeated. We stop when we reach the stage where
the regression on the most correlated variable is not significant. Of
course, once a factor has been adjusted for (i.e., entered), it is not
considered as part of the variable pool in subsequent stages.
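The procedure just described can be sketched in a few lines of Python. This is only an illustrative sketch, not the program used for this report: the function names, the 8-run data set, and the critical value 5.99 (approximately the upper 5% point of an F-distribution with 1 and 6 degrees of freedom) are all assumptions made for the example.

```python
import math


def centered_dot(x, y):
    """Sum of (x - mean(x)) * (y - mean(y)) over the runs."""
    n = len(y)
    xbar, ybar = sum(x) / n, sum(y) / n
    return sum((a - xbar) * (b - ybar) for a, b in zip(x, y))


def f_statistic(x, y):
    """F statistic (1 and n-2 df) for a simple regression of y on x."""
    n = len(y)
    sxx, syy = centered_dot(x, x), centered_dot(y, y)
    if sxx == 0 or syy == 0:
        return 0.0
    r2 = centered_dot(x, y) ** 2 / (sxx * syy)
    return math.inf if r2 >= 1 else (n - 2) * r2 / (1 - r2)


def stagewise(columns, y, f_crit):
    """Return the indices of the entered factors, in order of entry."""
    n = len(y)
    response = list(y)
    pool = list(range(len(columns)))
    entered = []
    while pool:
        # The factor most correlated with the current response has the largest F.
        best = max(pool, key=lambda j: f_statistic(columns[j], response))
        if f_statistic(columns[best], response) <= f_crit:
            break  # regression on the most correlated factor is not significant
        x = columns[best]
        slope = centered_dot(x, response) / centered_dot(x, x)
        xbar, rbar = sum(x) / n, sum(response) / n
        # Residuals of the simple regression become the next stage's response.
        response = [b - rbar - slope * (a - xbar) for a, b in zip(x, response)]
        entered.append(best)
        pool.remove(best)  # an entered factor leaves the variable pool
    return entered


# Hypothetical 8-run illustration: factors 0 and 1 are active, 2 and 3 are not.
x0 = [1, 1, 1, 1, -1, -1, -1, -1]
x1 = [1, 1, -1, -1, 1, 1, -1, -1]
x2 = [1, -1, 1, -1, 1, -1, 1, -1]
x3 = [1, -1, -1, 1, -1, 1, 1, -1]
noise = [0.32, 0.08, -0.08, -0.32, -0.28, -0.12, 0.12, 0.28]
y = [5 * a + 2 * b + e for a, b, e in zip(x0, x1, noise)]
entered = stagewise([x0, x1, x2, x3], y, f_crit=5.99)
```

In this made-up example the two active factors are entered in order of size and the procedure stops at the third stage. Note that each stage regresses on a single column only; the remaining columns are never adjusted for the factors already entered.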

The analysis procedure just described has been known under a variety
of descriptive titles. We will refer to it here as "stagewise regression,"
which is the terminology used by Draper and Smith [3]. We should emphasize
that the stagewise regression (SR) solution is not the multiple least
squares solution for the variables involved. This is because at each stage
of the SR procedure the remaining factors are not adjusted for previously
entered factors.

The purpose of this technical report is to investigate the use of SR
as a method of analysis for RB screening experiments. Our approach is to
determine the efficiency of the first two stages in order to obtain an
indication of what can occur between consecutive stages. In doing so, the
SR method is compared with the individual F-test approach, as considered
previously by Mauro and Smith. Finally, two Monte Carlo case studies are
conducted.


II.  A  SCREENING  MODEL 


When  evaluating  the  performance  of  a  screening  strategy,  one  must 
consider  both  how  many  runs  are  required  and  how  accurately  factors  are 
identified.  Although  the  factors  may  range  in  importance  from  highly 
critical  to  negligible,  we  generally  classify  factors  as  either  "important" 
or  "unimportant".  The  factors  deemed  important  are  usually  investigated 
more  intensively  in  subsequent  experimentation. 

In order to provide a common statistical basis to evaluate and
compare screening methods, we must make some assumptions regarding a general
screening model. First of all, we assume that each factor is assigned or
has two levels, high (+1) and low (-1). Using two levels for each factor
is generally sufficient for screening purposes. Second, for detecting
the factors having major effects it is usually reasonable to assume an
additive model. Thus, we assume the model:


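$$ y_i = \beta_0 + \sum_{j=1}^{K} \beta_j x_{ij} + \varepsilon_i , \qquad i = 1, \ldots, N, \qquad (2.1) $$

where $y_i$ is the value of the response in the $i$th run; $x_{ij} = \pm 1$
depending upon the level of the $j$th factor in the $i$th run; $\beta_j$ is
the (linear) effect of the $j$th factor; and the error terms $\varepsilon_i$
are independent and normally distributed with zero mean and variance
$\sigma^2$.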

In essence, model (2.1) is a first-order Taylor series approximation
to the actual relationship between the response and the experimental
factors; ordinarily, we would assume model (2.1) over a relatively small
region of the factor space. We will restrict performance assessment to this
model.


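In matrix terms we can write model (2.1) compactly as
$\mathbf{y} = \beta_0 \mathbf{1} + X \boldsymbol{\beta} + \boldsymbol{\varepsilon}$,
where $\mathbf{1}$ is an N x 1 vector of +1's, $\mathbf{y} = (y_i)$ is an
N x 1 vector of responses, $\boldsymbol{\varepsilon} = (\varepsilon_i)$ is an
N x 1 vector of error terms, $\boldsymbol{\beta} = (\beta_j)$ is a K x 1
vector of factor effects, and $X = (x_{ij})$ is an N x K design matrix.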

In an RB design, the design matrix X is stochastic. Specifically, in
a two-level ($\pm 1$) RB design each column of the design matrix consists of
N/2 +1's and N/2 -1's, where N (an even number) denotes the number of runs.
The +1's and -1's in each column are assigned randomly, making all possible
combinations of N/2 +1's and N/2 -1's (there are $\binom{N}{N/2}$) equally
likely, with each column receiving an independent randomization. Factors are
therefore confounded to a random degree. Moreover, we cannot generally
control the amount of confounding or interdependence between factors.
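As a concrete illustration of this construction, the following Python sketch generates a two-level RB design by shuffling a balanced column independently for each factor. The function names and the choice of seed are assumptions made for the example; shuffling a fixed balanced column makes every arrangement of N/2 +1's and N/2 -1's equally likely.

```python
import math
import random


def rb_design(n_runs, n_factors, rng=None):
    """Return a list of n_factors balanced +/-1 columns of length n_runs."""
    if n_runs % 2 != 0:
        raise ValueError("an RB design needs an even number of runs")
    rng = rng or random.Random()
    columns = []
    for _ in range(n_factors):
        col = [1] * (n_runs // 2) + [-1] * (n_runs // 2)
        rng.shuffle(col)  # independent randomization for each column
        columns.append(col)
    return columns


def n_possible_columns(n_runs):
    """Number of distinct balanced columns, i.e. N choose N/2."""
    return math.comb(n_runs, n_runs // 2)


# A supersaturated example: K = 20 factors in N = 12 runs.
design = rb_design(12, 20, random.Random(1))
```

For N = 12 there are 924 possible columns, so two factors can by chance receive highly correlated (or even identical) columns; nothing in the construction prevents it.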


III.  THE  STAGEWISE  REGRESSION  METHOD 


In this section we attempt to gain some understanding of the
behavior of SR when used as the method of analysis for RB screening
experiments. To obtain an indication of the possible benefits of SR, we
derive an expression for the relative efficiency of the second-stage to
the first-stage estimator of a factor effect. A comparison of the first
two stages should provide some indication of what can happen in SR and
what might be gained (or lost) in general by the stagewise procedure.

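To begin, the first-stage estimator of $\beta_j$ is denoted by
$\hat\beta_j$ and is given by

$$ \hat\beta_j = \mathbf{x}_j' \mathbf{y} / N , \qquad (3.1) $$

where $\mathbf{x}_j$ denotes the $j$th column vector of the design matrix.
Correspondingly, the second-stage estimator of $\beta_j$ is denoted by
$\hat\beta_j(i_1)$, for $j \ne i_1$, and is given by

$$ \hat\beta_j(i_1) = \mathbf{x}_j' \mathbf{y}(i_1) / N , \qquad (3.2) $$

where

$$ \mathbf{y}(i_1) = \mathbf{y} - \bar{y} \mathbf{1} - \hat\beta_{i_1} \mathbf{x}_{i_1} \qquad (3.3) $$

and $i_1$ denotes the index of the factor showing the largest effect in the
first stage of the procedure. The vector $\mathbf{y}(i_1)$ is the vector of
first-stage residuals.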

Substituting (3.3) into (3.2), we see that

$$ \hat\beta_j(i_1) = \hat\beta_j - \hat\beta_{i_1} r_{i_1 j} , \qquad (3.4) $$

where $r_{i_1 j} = \mathbf{x}_{i_1}' \mathbf{x}_j / N$. In RB designs the
variable $r_{i_1 j}$ is the sample correlation coefficient for
$\mathbf{x}_{i_1}$ and $\mathbf{x}_j$ and is a measure of the orthogonality
between the two respective design columns.
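The substitution leading to (3.4) can be checked numerically. The sketch below is an illustration only: the helper names, the seed, and the simulated responses are assumptions. It computes the second-stage estimator directly from the first-stage residuals and compares it with the right-hand side of (3.4). The identity relies on each design column being balanced, so that the mean-correction term drops out; it holds for any entered column, not only the one showing the largest effect.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def first_stage(x_j, y):
    """Equation (3.1): x_j'y / N."""
    return dot(x_j, y) / len(y)


def second_stage(x_j, x_i, y):
    """Equations (3.2)-(3.3): x_j'(first-stage residuals) / N."""
    n = len(y)
    ybar = sum(y) / n
    b_i = first_stage(x_i, y)
    resid = [yk - ybar - b_i * xk for yk, xk in zip(y, x_i)]
    return dot(x_j, resid) / n


rng = random.Random(0)  # arbitrary seed, for reproducibility only
n = 12
x_i = [1] * 6 + [-1] * 6
x_j = [1] * 6 + [-1] * 6
rng.shuffle(x_i)
rng.shuffle(x_j)
# Hypothetical responses: effects 3 and 1 plus standard normal error.
y = [3 * a + b + rng.gauss(0, 1) for a, b in zip(x_i, x_j)]

r_ij = dot(x_i, x_j) / n  # sample correlation of the two columns
lhs = second_stage(x_j, x_i, y)
rhs = first_stage(x_j, y) - first_stage(x_i, y) * r_ij  # equation (3.4)
```

Up to floating-point rounding, `lhs` and `rhs` agree.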

The estimator of $\beta_j$ as defined in (3.1) is precisely the estimator
considered by Mauro and Smith [4] under the individual F-test approach for
analyzing RB experiments. Mauro and Smith have shown that

$$ E(\hat\beta_j) = \beta_j \qquad (3.5) $$

and

$$ V(\hat\beta_j) = (\tau^2 - \beta_j^2)/(N-1) + \sigma^2/N , \qquad (3.6) $$

where $\tau^2 = \sum_{j=1}^{K} \beta_j^2$. Although $\hat\beta_j$ is an
unbiased estimator of $\beta_j$, its variance can be seriously inflated. The
basic idea behind the use of SR is to reduce the effect of the inflation by
adjusting for those factors which appear to have large effects.
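The factor 1/(N-1) in (3.6) is the second moment of the correlation between two independently randomized columns, the same quantity that appears in the Appendix as $E[(\mathbf{x}_i' \mathbf{x}_j)^2] = N^2/(N-1)$. For a small N this can be confirmed by exhaustive enumeration. The following Python sketch does so with exact fractions; the function names are assumptions made for the example, and N = 6 is chosen only to keep the enumeration small.

```python
from fractions import Fraction
from itertools import combinations


def balanced_columns(n):
    """All +/-1 columns of length n with exactly n/2 entries equal to +1."""
    cols = []
    for plus in combinations(range(n), n // 2):
        chosen = set(plus)
        cols.append([1 if k in chosen else -1 for k in range(n)])
    return cols


def correlation_moments(n):
    """Exact E[r] and E[r^2] for two independently randomized columns."""
    cols = balanced_columns(n)
    total_r = Fraction(0)
    total_r2 = Fraction(0)
    for u in cols:
        for v in cols:
            r = Fraction(sum(a * b for a, b in zip(u, v)), n)
            total_r += r
            total_r2 += r * r
    pairs = len(cols) ** 2
    return total_r / pairs, total_r2 / pairs


def var_first_stage(beta_j, tau2, sigma2, n):
    """Equation (3.6): variance of the first-stage estimator."""
    return (tau2 - beta_j ** 2) / (n - 1) + sigma2 / n


mean_r, mean_r2 = correlation_moments(6)  # 20 x 20 column pairs
```

For N = 6 the enumeration gives E[r] = 0 and E[r^2] = 1/5, in agreement with 1/(N-1). Each of the other K-1 effects therefore contributes $\beta_m^2/(N-1)$ to the variance of $\hat\beta_j$, which is the source of the inflation noted above.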

Regarding the estimator $\hat\beta_j(i_1)$, we have from (3.4) and (3.5) that

$$ E[\hat\beta_j(i_1)] = \beta_j - E[\hat\beta_{i_1} r_{i_1 j}] . \qquad (3.7) $$

In the Appendix we show for $i_1 \ne j$ that
$E[\hat\beta_{i_1} r_{i_1 j}] = \beta_j/(N-1)$, so that

$$ E[\hat\beta_j(i_1)] = \beta_j [(N-2)/(N-1)] . \qquad (3.8) $$

The estimator $\hat\beta_j(i_1)$ is therefore slightly biased for $\beta_j$.
We can easily remove the bias by considering the modified estimator

$$ \hat\beta_j^*(i_1) = \hat\beta_j(i_1) [(N-1)/(N-2)] . \qquad (3.9) $$

Since $\hat\beta_j^*(i_1)$ and $\hat\beta_j$ are both unbiased estimators of
$\beta_j$, it is meaningful to compare their respective variances. That is,
we wish to calculate the efficiency of $\hat\beta_j^*(i_1)$ relative to
$\hat\beta_j$.


Accordingly, we define the measure

$$ EFF = V[\hat\beta_j^*(i_1)] / V[\hat\beta_j] . \qquad (3.10) $$

This ratio measures the amount of information supplied by $\hat\beta_j$
relative to that supplied by $\hat\beta_j^*(i_1)$.

Applying the results given in the Appendix, it is easily shown that
the variance of $\hat\beta_j^*(i_1)$, conditional on $i_1 = i$, is given by

$$ V_{i_1 = i}[\hat\beta_j^*(i_1)] = (\tau^2 - \beta_i^2 - \beta_j^2)/(N-2) + 2\beta_j^2/[N(N-3)] + \sigma^2 (N-1)/[N(N-2)] . \qquad (3.11) $$

The efficiency measure defined in (3.10) requires the unconditional
variance of $\hat\beta_j^*(i_1)$, however. In other words, we must evaluate
(3.11) over variation in $i_1$. Unfortunately, we have found this problem to
be intractable. Nevertheless, equation (3.11) is still useful to our
analysis of the first two stages of the SR method.

With some algebraic manipulation we can show that, given $i_1 = i$,

$$ EFF = [(N-1)/(N-2)] [1 + \phi] , \qquad (3.12) $$

where

$$ \phi = \frac{2\beta_j^2 (N-2) - \beta_i^2 N(N-3)}{N(N-1)(N-3) V(\hat\beta_j)} . \qquad (3.13) $$

Thus, given $i_1 = i$ and for N large, EFF is approximately

$$ EFF \approx 1 + (2\beta_j^2/N - \beta_i^2)/(\tau^2 - \beta_j^2 + \sigma^2) . \qquad (3.14) $$

We see from (3.14) that $EFF \le 1$ if and only if

$$ |\beta_j| \le (N/2)^{1/2} |\beta_{i_1}| . \qquad (3.15) $$

That is, $\hat\beta_j^*(i_1)$ is a more efficient estimator of $\beta_j$
than $\hat\beta_j$ as long as (3.15) holds. If $\hat\beta_j^*(i_1)$ is to be
uniformly more efficient than $\hat\beta_j$, then (3.15) must hold for
every j; equivalently, we require

$$ \max_j |\beta_j| \le (N/2)^{1/2} |\beta_{i_1}| . \qquad (3.16) $$

The term

$$ (2\beta_j^2/N - \beta_i^2)/(\tau^2 - \beta_j^2 + \sigma^2) \qquad (3.17) $$

appearing in (3.14) represents the gain (if (3.17) is negative) or loss
(if (3.17) is positive) of efficiency in the second stage given that
$i_1 = i$. Most likely the denominator in (3.17) will be dominated by
$\tau^2$. When (3.15) holds, the numerator in (3.17) is likely dominated by
$\beta_{i_1}^2$; thus the gain in efficiency is roughly

$$ \beta_{i_1}^2 / \tau^2 . \qquad (3.18) $$

It is apparent from (3.18) that unless the contribution of $\beta_{i_1}^2$
is large relative to the total effect ($\tau^2$), there is little gain in
efficiency.

It is interesting to note that the maximum loss of efficiency occurs
when $\beta_{i_1} = 0$, that is, when the factor showing the largest effect
in the first stage actually has no effect whatsoever. In this case, the loss
of efficiency is roughly

$$ 2\beta_j^2 / [N(\tau^2 - \beta_j^2)] . \qquad (3.19) $$
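To make the efficiency expressions concrete, the sketch below codes the variance formulas (3.6) and (3.11), the ratio (3.10), the closed form (3.12)-(3.13), the large-N approximation (3.14), and condition (3.15). It is a minimal illustration under stated assumptions: the function names are invented for the example, effects are expressed in units of $\sigma$ (so $\sigma^2 = 1$), and the numerical inputs are borrowed from Case Study I of Section IV ($N = 12$, $\tau^2 = 35.51$).

```python
def var_first(beta_j, tau2, sigma2, n):
    """Equation (3.6): variance of the first-stage estimator."""
    return (tau2 - beta_j ** 2) / (n - 1) + sigma2 / n


def var_second(beta_j, beta_i, tau2, sigma2, n):
    """Equation (3.11): variance of the unbiased second-stage estimator,
    conditional on factor i having been entered at the first stage."""
    return ((tau2 - beta_i ** 2 - beta_j ** 2) / (n - 2)
            + 2 * beta_j ** 2 / (n * (n - 3))
            + sigma2 * (n - 1) / (n * (n - 2)))


def eff_ratio(beta_j, beta_i, tau2, sigma2, n):
    """Equation (3.10), with (3.11) in the numerator."""
    return (var_second(beta_j, beta_i, tau2, sigma2, n)
            / var_first(beta_j, tau2, sigma2, n))


def eff_closed_form(beta_j, beta_i, tau2, sigma2, n):
    """Equations (3.12)-(3.13)."""
    phi = ((2 * beta_j ** 2 * (n - 2) - beta_i ** 2 * n * (n - 3))
           / (n * (n - 1) * (n - 3) * var_first(beta_j, tau2, sigma2, n)))
    return (n - 1) / (n - 2) * (1 + phi)


def eff_large_n(beta_j, beta_i, tau2, sigma2, n):
    """Equation (3.14): large-N approximation."""
    return 1 + (2 * beta_j ** 2 / n - beta_i ** 2) / (tau2 - beta_j ** 2 + sigma2)


def second_stage_helps(beta_j, beta_i, n):
    """Condition (3.15)."""
    return abs(beta_j) <= (n / 2) ** 0.5 * abs(beta_i)


n, sigma2, tau2 = 12, 1.0, 35.51
# Largest effect (5.3) entered first; estimate the next largest (2.3).
good = eff_ratio(2.3, 5.3, tau2, sigma2, n)
# A factor with no effect entered first; same target effect.
bad = eff_ratio(2.3, 0.0, tau2, sigma2, n)
```

With these inputs `good` is well below 1 (a large gain from adjusting for the dominant factor) and `bad` is above 1 (a loss from adjusting for a factor with no real effect), which is the qualitative behavior described by (3.15) through (3.19).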

In summary, our analysis indicates that if the actual effect of the
factor showing the largest effect (in the first stage) is sufficiently
large, then we can obtain improved estimates of factor effects in the
second stage. This observation is clear from equations (3.15) and (3.16).
The extent of the improvement, however, may be slight depending on the
relative contribution of the apparently largest effect to the total
effect.

Our analysis takes on additional meaning considering that the (SR)
first-stage estimation procedure is identical to the separate F-test
estimation procedure considered by Mauro and Smith [4]. Our discussion,
then, provides some preliminary indication of how these two alternative
analysis techniques would compare. Admittedly, the results derived in
this section do not completely answer the question of which procedure is
preferable, nor do they provide a conclusive overall picture of the
multistage SR method. However, the results do indicate in which situations
the difference is likely to be worth considering. To gain further insight
into this problem we conducted two Monte Carlo case studies, the results of
which are presented and discussed in the next section.


IV. MONTE CARLO RESULTS


In this section we consider two synthesized examples in which all the
true effects are known beforehand. In both examples we assume that K = 20
factors are to be screened in an RB screening experiment having N = 12 runs.

We  simulated  each  test  case  300  times  and  analyzed  the  test  results  of 
each  simulation  with  both  the  SR  and  the  separate  F-test  (SFT)  methods. 

The  distributions  of  factor  effects  used  in  each  case  study  are  given  in 
Figures  1  and  2. 

The absolute effects selected for Case Study I are basically
(negligible effects were grouped) the expected order statistics from a
sample of 20 deviates from a gamma distribution having mean $0.5\sigma$ and
standard deviation $1.58\sigma$. The absolute effects selected for Case
Study II are basically the expected order statistics from a sample of 20
exponential random deviates having mean and standard deviation $1.0\sigma$.

In applying the SFT method we conducted each F-test at the same
significance level, $\alpha$. We tested for significance at the following
eight $\alpha$ levels: .05 (.05) .40, that is, .05 through .40 in steps of
.05. These same $\alpha$ levels were used for determining the stopping rules
in the SR method.*
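A simulation of this kind can be sketched as follows. This is an assumed reconstruction for illustration, not the program used to produce the tables: the effects are those listed for Case Study I in Table 2, only the single level $\alpha$ = .05 is run, the critical value 4.96 (approximately the upper 5% point of an F-distribution with 1 and 10 degrees of freedom) is supplied by hand, and the seed and all names are arbitrary.

```python
import math
import random

# Absolute effects in units of sigma, as listed for Case Study I in Table 2.
EFFECTS = [0.0] * 12 + [0.1] * 2 + [0.3] * 2 + [0.7, 1.2, 2.3, 5.3]
N_RUNS = 12
F_CRIT = 4.96  # assumed: approximate upper 5% point of F(1, 10)


def f_stat(x, y):
    """F statistic (1 and N-2 df) for regressing y on a balanced +/-1 column."""
    n = len(y)
    ybar = sum(y) / n
    syy = sum((b - ybar) ** 2 for b in y)
    if syy == 0:
        return 0.0
    r2 = sum(a * (b - ybar) for a, b in zip(x, y)) ** 2 / (n * syy)
    return math.inf if r2 >= 1 else (n - 2) * r2 / (1 - r2)


def separate_f_tests(cols, y):
    """SFT: test every factor on the original responses."""
    return {j for j, x in enumerate(cols) if f_stat(x, y) > F_CRIT}


def stagewise(cols, y):
    """SR: enter the largest apparent effect, then continue on the residuals."""
    n, resid = len(y), list(y)
    pool, entered = set(range(len(cols))), set()
    while pool:
        best = max(pool, key=lambda j: f_stat(cols[j], resid))
        if f_stat(cols[best], resid) <= F_CRIT:
            break
        rbar = sum(resid) / n
        slope = sum(a * (b - rbar) for a, b in zip(cols[best], resid)) / n
        resid = [b - rbar - slope * a for a, b in zip(cols[best], resid)]
        entered.add(best)
        pool.remove(best)
    return entered


def simulate(n_sims, seed):
    """Return, per method, the fraction of simulations declaring each factor important."""
    rng = random.Random(seed)
    k = len(EFFECTS)
    counts = {"SFT": [0] * k, "SR": [0] * k}
    for _sim in range(n_sims):
        cols = []
        for _col in range(k):
            col = [1] * (N_RUNS // 2) + [-1] * (N_RUNS // 2)
            rng.shuffle(col)
            cols.append(col)
        y = [sum(b * c[i] for b, c in zip(EFFECTS, cols)) + rng.gauss(0, 1)
             for i in range(N_RUNS)]
        for j in separate_f_tests(cols, y):
            counts["SFT"][j] += 1
        for j in stagewise(cols, y):
            counts["SR"][j] += 1
    return {m: [c / n_sims for c in v] for m, v in counts.items()}


rates = simulate(n_sims=300, seed=1)
```

The first twelve entries of each list estimate the observed significance level (the factors with zero effect), and the last entry estimates the power for the largest effect. Exact figures depend on the seed and will not reproduce the published tables digit for digit.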

The results of Case Study I are summarized in Table 1. We see from
this table that the observed significance levels associated with the SFT
method agree closely with the various $\alpha$ levels employed. The observed
significance levels associated with the SR method, however, are
significantly larger than the $\alpha$ levels that define the stopping
rules. This problem is not unique to the SR method, but is often found with
other sequential variable selection procedures. However, it complicates the
application of the SR method in that it is difficult to control the risk of
declaring important a factor having negligible effect.

*We stop at the stage where the maximum F-statistic does not exceed the
upper $100(1-\alpha)$ percentage point of an F-distribution having 1 and
(N-2) degrees of freedom.

It is quite clear from Table 1 that for strategies (i.e., columns)
having comparable empirical Type I error rate, we obtain substantially
greater power with the SR method than with the SFT method, particularly for
$|\beta|/\sigma = 0.7$, 1.2, and 2.3. In detecting the largest effect,
$|\beta|/\sigma = 5.3$, both methods were highly accurate. In fact, from
Table 2 we see that this particular effect was entered at the first stage of
the SR method in each of the 300 simulations. The next largest effect,
$|\beta|/\sigma = 2.3$, was entered at the second stage in 242 of the 300
simulations.

The results of Case Study II are summarized in Table 3. We note that
the same observations made in Case Study I regarding the observed
significance levels also apply to Case Study II. We do not, however, always
obtain greater power with SR strategies than with SFT strategies having
comparable empirical Type I error rate. We see instead that the SFT method is
more powerful for detecting the larger effects ($|\beta|/\sigma > 1.5$) and
the SR method is more powerful for detecting the moderate to smaller
effects. We can offer two reasons for this based on our analysis made in
Section III. First, we can expect the SR method to be more sensitive to the
relatively small effects than the SFT method (and this is true in general)
because the chance that (3.15) is true is greater for smaller effects. Thus,
a gain in efficiency will, more often than not, be propagated through the
stagewise procedure. Second, for the particular set of effects used in the
second case study, the larger effects are not always entered early in the SR
procedure. This is evident from Table 4. We see from this table that there
is a one-in-three chance that the effect showing the largest effect in the
first stage will actually be less than $1.5\sigma$ in absolute magnitude. In
the second stage the chance of this occurring is one in two. Thus, for the
larger effects a loss of efficiency is often being propagated. As a
consequence, the SFT method shows greater power for detecting the relatively
large effects.
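The one-in-three and one-in-two figures follow directly from the counts in Table 4. The short Python sketch below redoes the arithmetic; the data structure and variable names are assumptions made for the example, and the counts are copied from Table 4.

```python
# (|beta|/sigma label, first-stage count, second-stage count), from Table 4.
TABLE_4 = [
    ("<=0.66", 52, 87), ("0.76", 9, 6), ("0.87", 8, 11), ("1.00", 9, 20),
    ("1.14", 9, 10), ("1.31", 12, 13), ("1.51", 13, 16), ("1.75", 15, 19),
    ("2.08", 28, 28), ("2.57", 46, 33), ("3.55", 99, 57),
]
# Rows whose true effect is below 1.5 sigma in absolute magnitude.
SMALL = {"<=0.66", "0.76", "0.87", "1.00", "1.14", "1.31"}

n_sims = sum(first for _, first, _ in TABLE_4)  # total simulations
small_first = sum(first for label, first, _ in TABLE_4 if label in SMALL)
small_second = sum(second for label, _, second in TABLE_4 if label in SMALL)

frac_first = small_first / n_sims    # chance at the first stage
frac_second = small_second / n_sims  # chance at the second stage
```

The counts give 99 of 300 at the first stage and 147 of 300 at the second, that is, about 0.33 and 0.49.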

One final observation may be made. An easy calculation shows that
$\tau^2 = 35.51$ and $\tau^2 = 35.77$ (in units of $\sigma^2$) in Case
Studies I and II, respectively. The relative contribution of the largest
absolute effect to the total effect is therefore $(5.3)^2/35.51 = .79$ in
Case Study I and $(3.55)^2/35.77 = .35$ in Case Study II. The larger
relative contribution of the largest effect in Case Study I implies there is
a greater chance in Case Study I than in Case Study II of selecting the
largest effect in the first stage of the SR method. Moreover, it indicates
that there is greater potential gain in efficiency.
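The "easy calculation" can be reproduced from the effect values tabulated for the two case studies. In the sketch below the (effect, number of factors) pairs are transcribed from Tables 2 and 3, with effects in units of $\sigma$; the names are assumptions made for the example.

```python
# (|beta|/sigma, number of factors with that effect)
CASE_1 = [(0.0, 12), (0.1, 2), (0.3, 2), (0.7, 1), (1.2, 1), (2.3, 1), (5.3, 1)]
CASE_2 = [(0.00, 3), (0.30, 2), (0.40, 2), (0.50, 1), (0.60, 1), (0.66, 1),
          (0.76, 1), (0.87, 1), (1.00, 1), (1.14, 1), (1.31, 1), (1.51, 1),
          (1.75, 1), (2.08, 1), (2.57, 1), (3.55, 1)]


def total_effect(spec):
    """tau^2: the sum of squared effects over all factors."""
    return sum(count * beta ** 2 for beta, count in spec)


def largest_share(spec):
    """Share of tau^2 contributed by the largest absolute effect, as in (3.18)."""
    largest = max(beta for beta, _ in spec)
    return largest ** 2 / total_effect(spec)


tau2_1, tau2_2 = total_effect(CASE_1), total_effect(CASE_2)
share_1, share_2 = largest_share(CASE_1), largest_share(CASE_2)
```

Rounded to two decimals, this gives 35.51 and 35.77 for the totals and .79 and .35 for the shares, matching the figures quoted above.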


V. CONCLUSIONS


Although SR is a more sophisticated analysis technique than the SFT
method, there are situations in which the SFT method has greater power for
detecting the larger effects. Computationally, both methods are relatively
quick and easy to apply. The key to SR is early detection of the relatively
large effects. If the most critical factors are not entered early in the
stagewise procedure, the possibility of their nondetection is increased.
It is precisely this type of scenario where SR will be less efficient than
the SFT analysis method.

The most favorable situation to the SR method is when only a relatively
small number of factors are responsible for all or much of the total effect.
In such cases the difference in effectiveness between the SR and SFT methods
is likely to be large. A drawback to the SR method, as in most sequential
selection procedures, is that it is difficult to control the true
significance level of the test. For example, in the Monte Carlo case studies
presented in Section IV, the actual value of $\alpha$ was roughly 50% greater
than the "entry $\alpha$."


VI. APPENDIX


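In this section we state three key results that were used to establish
equations (3.8) and (3.11). A proof is provided for the first result only.

Result #1: For $i \ne j$, $E[\hat\beta_i r_{ij}] = \beta_j/(N-1)$.

Proof. Note that
$\hat\beta_i = \mathbf{x}_i' \mathbf{y}/N = \sum_{m=1}^{K} \beta_m \mathbf{x}_i' \mathbf{x}_m / N + \mathbf{x}_i' \boldsymbol{\varepsilon} / N$.
Now, $E[\hat\beta_i r_{ij}] = (1/N^2) \sum_{m} \beta_m E[\mathbf{x}_i' \mathbf{x}_m \mathbf{x}_i' \mathbf{x}_j]$,
since $E[\mathbf{x}_i' \boldsymbol{\varepsilon} \, \mathbf{x}_i' \mathbf{x}_j] = 0$.
For $i \ne j$, $E[\mathbf{x}_i' \mathbf{x}_m \mathbf{x}_i' \mathbf{x}_j] = 0$
unless $m = j$. Thus,
$E[\hat\beta_i r_{ij}] = (1/N^2) \beta_j E[(\mathbf{x}_i' \mathbf{x}_j)^2] = (1/N^2) \beta_j (N^2/(N-1)) = \beta_j/(N-1)$.
To obtain the desired result, observe that for $j \ne i_1$,
$E[\hat\beta_{i_1} r_{i_1 j}] = E(E[\hat\beta_{i_1} r_{i_1 j} \mid i_1 = i]) = E[\beta_j/(N-1)] = \beta_j/(N-1)$.

Result #2: For $i \ne j$,
$E[\hat\beta_i \hat\beta_j r_{ij}] = (\beta_i^2 + \beta_j^2)/(N-1) + (\tau^2 - \beta_i^2 - \beta_j^2)/(N-1)^2 + \sigma^2/[N(N-1)]$.

Result #3: For $i \ne j$,
$E[\hat\beta_i^2 r_{ij}^2] = \sigma^2/[N(N-1)] + \beta_i^2/(N-1) + (\tau^2 - \beta_i^2 - \beta_j^2)/(N-1)^2 + \theta_N \beta_j^2$,
where

$$ \theta_N = (3N-8)/[N(N-1)(N-3)] . $$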
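For readers who wish to trace how Results #2 and #3 lead to (3.11), one possible route is outlined below. It is a reconstruction offered as a sketch, not the author's own working, and it conditions on $i_1 = i$ throughout in the same way as the text. The shorthand $T = \tau^2 - \beta_i^2 - \beta_j^2$ is introduced here only for brevity.

```latex
% Expand the square of (3.4) and take expectations, using
% E[\hat\beta_j^2] = V(\hat\beta_j) + \beta_j^2 from (3.5)-(3.6):
\begin{align*}
E[\hat\beta_j(i)^2]
  &= E[\hat\beta_j^2] - 2E[\hat\beta_i\hat\beta_j r_{ij}] + E[\hat\beta_i^2 r_{ij}^2] \\
  &= \frac{T(N-2)}{(N-1)^2}
     + \beta_j^2\left[\frac{N-3}{N-1} + \theta_N\right]
     + \frac{\sigma^2 (N-2)}{N(N-1)} .
\end{align*}
% Subtract the squared mean from (3.8), noting that
% (N-3)(N-1) - (N-2)^2 = -1 and
% \theta_N - (N-1)^{-2} = 2(N-2)^2 / [N(N-1)^2(N-3)]:
\begin{align*}
V[\hat\beta_j(i)]
  &= \frac{T(N-2)}{(N-1)^2}
     + \frac{2\beta_j^2 (N-2)^2}{N(N-1)^2(N-3)}
     + \frac{\sigma^2 (N-2)}{N(N-1)} .
\end{align*}
% Rescale by [(N-1)/(N-2)]^2, as required by (3.9):
\begin{align*}
V[\hat\beta_j^*(i)]
  &= \frac{T}{N-2} + \frac{2\beta_j^2}{N(N-3)} + \frac{\sigma^2 (N-1)}{N(N-2)} ,
\end{align*}
% which is (3.11).
```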


VII.  REFERENCES 


[1] Anscombe, F. J., "Quick Analysis Methods for Random Balance Screening
Experiments," Technometrics, Vol. 1, pp. 195-209, 1959.

[2] Budne, T. A., "The Application of Random Balance Designs,"
Technometrics, Vol. 1, pp. 139-155, 1959.

[3] Draper, N. R., and Smith, H., Applied Regression Analysis, Second
Edition, John Wiley & Sons, Inc., New York, 1981.

[4] Mauro, C. A., and Smith, D. E., "Factor Screening in Simulation:
Evaluation of Two Strategies Based on Random Balance Sampling," Technical
Report No. 113-7, Desmatics, Inc., 1982. (Also to appear in Management
Science.)

[5] Satterthwaite, F. W., "Random Balance Experimentation," Technometrics,
Vol. 1, pp. 111-137, 1959.

[6] Youden, W. J., Kempthorne, O., Tukey, J. W., Box, G. E. P., and Hunter,
J. S., "Discussion of the Papers of Messrs. Satterthwaite and Budne,"
Technometrics, Vol. 1, pp. 157-193, 1959.


[Table 1, which summarizes the results of Case Study I for the SFT and SR
methods, is illegible in the source scan.]


                       Number of Simulations     Number of Simulations
$|\beta|/\sigma$  No.  Entered at First Stage    Entered at Second Stage

      0.0         12             0                         25
      0.1          2             0                          5
      0.3          2             0                          6
      0.7          1             0                          9
      1.2          1             0                         13
      2.3          1             0                        242
      5.3          1           300                          0

Table 2: Observed Counts for Entering (i.e., showing largest effect)
         at First and Second Stages of SR Method for Case Study I.


SEPARATE F-TESTS METHOD

                                       $\alpha$ Level
$|\beta|/\sigma$  No.   .05    .10    .15    .20    .25    .30    .35    .40

      0.00         3   .043   .091   .153   .209   .258   .303   .352   .403
      0.30         2   .067   .117   .170   .220   .260   .312   .352   .388
      0.40         2   .077   .117   .163   .220   .275   .312   .363   .410
      0.50         1   .043   .100   .150   .197   .257   .303   .360   .413
      0.60         1   .073   .127   .177   .230   .287   .327   .377   .447
      0.66         1   .047   .113   .173   .237   .277   .313   .357   .427
      0.76         1   .077   .140   .203   .230   .277   .327   .373   .430
      0.87         1   .083   .147   .200   .247   .317   .410   .467   .510
      1.00         1   .083   .140   .210   .270   .320   .363   .413   .473
      1.14         1   .073   .143   .203   .257   .317   .363   .410   .470
      1.31         1   .100   .163   .230   .280   .353   .410   .473   .513
      1.51         1   .133   .207   .267   .333   .397   .443   .507   .540
      1.75         1   .103   .193   .253   .303   .360   .413   .480   .510
      2.08         1   .210   .313   .373   .443   .517   .593   .647   .690
      2.57         1   .330   .477   .557   .613   .663   .713   .750   .780
      3.55         1   .530   .663   .743   .810   .847   .893   .907   .933

SR METHOD

                                       $\alpha$ Level
$|\beta|/\sigma$  No.   .05    .10    .15    .20    .25    .30    .35    .40

      0.00         3   .084   .169   .227   .296   .364   .408   .451   .508
      0.30         2   .102   .185   .263   .325   .383   .420   .495   .540
      0.40         2   .092   .198   .273   .362   .417   .482   .527   .568
      0.50         1   .080   .200   .287   .373   .433   .493   .527   .570
      0.60         1   .087   .183   .233   .310   .407   .500   .550   .607
      0.66         1   .090   .193   .263   .330   .423   .493   .543   .600
      0.76         1   .083   .173   .243   .310   .370   .423   .467   .547
      0.87         1   .113   .247   .350   .437   .497   .530   .570   .600
      1.00         1   .153   .277   .370   .430   .483   .527   .577   .633
      1.14         1   .110   .233   .330   .410   .480   .507   .553   .597
      1.31         1   .133   .287   .343   .407   .490   .527   .557   .607
      1.51         1   .120   .243   .303   .393   .453   .487   .557   .617
      1.75         1   .173   .313   .403   .473   .547   .573   .613   .653
      2.08         1   .233   .357   .417   .510   .553   .593   .630   .677
      2.57         1   .280   .420   .503   .563   .630   .683   .737   .780
      3.55         1   .520   .687   .747   .790   .810   .827   .850   .873

Table 3: Summary of Results for Case Study II. Table Entry Represents the
         Empirical Probability Estimate ($\hat{p}$) That Given Effect Is
         Declared Important By Method. Standard Error of Each Estimate Is
         Given By $[\hat{p}(1-\hat{p})/300(\text{No.})]^{1/2}$.


                       Number of Simulations     Number of Simulations
$|\beta|/\sigma$  No.  Entered at First Stage    Entered at Second Stage

  $\le$ 0.66      10            52                         87
      0.76         1             9                          6
      0.87         1             8                         11
      1.00         1             9                         20
      1.14         1             9                         10
      1.31         1            12                         13
      1.51         1            13                         16
      1.75         1            15                         19
      2.08         1            28                         28
      2.57         1            46                         33
      3.55         1            99                         57

Table 4: Observed Counts for Entering (i.e., showing largest effect)
         at First and Second Stages of SR Method for Case Study II.